
Neural Information Processing Systems

In this section, we examine our theoretical results with controlled experiments on synthetic data. We do not have a complete explanation for such spikes. At first glance, overfitting could happen when the number of linear measurements is less than the size of the ground-truth matrix. Moreover, when the measurements satisfy RIP, Li et al. and Soltanolkotabi [45] show that GD exactly recovers the ground truth. To the best of our knowledge, most existing generalization analyses for flat regularization are for two-layer models, e.g., Li et al.


Implicit Regularization in Matrix Factorization

Suriya Gunasekar, Blake E. Woodworth, Srinadh Bhojanapalli, Behnam Neyshabur, Nati Srebro

Neural Information Processing Systems

This generalization ability cannot be explained by the capacity of the explicitly specified model class (namely, the functions representable in the chosen architecture). Instead, it seems that the optimization algorithm biases us toward a "simple" model, minimizing …





Implicit Regularization in Matrix Factorization

Gunasekar, Suriya, Woodworth, Blake E., Bhojanapalli, Srinadh, Neyshabur, Behnam, Srebro, Nati

Neural Information Processing Systems

We study implicit regularization when optimizing an underdetermined quadratic objective over a matrix $X$ with gradient descent on a factorization of $X$. We conjecture and provide empirical and theoretical evidence that with small enough step sizes and initialization close enough to the origin, gradient descent on a full dimensional factorization converges to the minimum nuclear norm solution.
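The setting in this abstract can be illustrated with a small numerical sketch. All specifics below (dimensions, number of measurements, step size, initialization scale, iteration count) are illustrative assumptions, not taken from the paper: gradient descent on a full-dimensional factorization $X = UU^\top$ of an underdetermined sensing problem, started near the origin, approximately recovers a low-rank ground truth.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, m = 10, 2, 45  # matrix size, ground-truth rank, number of measurements
# m = 45 < 55 = n(n+1)/2, so the linear system over symmetric X is underdetermined

# Low-rank PSD ground truth X* and m random symmetric sensing matrices A_k
W = rng.standard_normal((n, r)) / np.sqrt(n)
X_star = W @ W.T
A = rng.standard_normal((m, n, n))
A = (A + A.transpose(0, 2, 1)) / 2           # symmetrize each A_k
y = np.einsum('kij,ij->k', A, X_star)        # y_k = <A_k, X*>

# Gradient descent on f(U) = (1/m) * sum_k (<A_k, U U^T> - y_k)^2,
# full-dimensional factorization, initialized close to the origin
U = 1e-3 * np.eye(n)                         # small initialization (assumed scale)
lr = 0.05                                    # small step size (assumed)
for _ in range(10000):
    resid = np.einsum('kij,ij->k', A, U @ U.T) - y
    grad_X = np.einsum('k,kij->ij', resid, A) / m    # gradient w.r.t. X
    U -= lr * 2 * grad_X @ U                          # chain rule through X = U U^T

X_hat = U @ U.T
rel_err = np.linalg.norm(X_hat - X_star) / np.linalg.norm(X_star)
print(f"relative recovery error: {rel_err:.3e}")
```

Although $U$ is full dimensional ($10 \times 10$, so any symmetric $X$ is representable), the small initialization biases gradient descent toward a solution of low nuclear norm, which here coincides with the low-rank ground truth rather than an arbitrary measurement-consistent matrix.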

